INAOE's Participation at PAN'15: Author Profiling task

Authors

  • Miguel Ángel Álvarez Carmona
  • Adrián Pastor López-Monroy
  • Manuel Montes-y-Gómez
  • Luis Villaseñor Pineda
  • Hugo Jair Escalante
Abstract

In this paper, we describe the participation of the Language Technologies Lab of INAOE at PAN 2015. The Author Profiling (AP) literature draws on both discriminative and descriptive information; here we take such information to a higher level by exploiting a combination of discriminative and descriptive representations. To this end, we apply dimensionality reduction techniques on top of the typical discriminative and descriptive textual features used for the AP task. The main idea is that each representation, using the full feature space, automatically highlights different stylistic and thematic properties of the documents. Specifically, we propose the joint use of Second Order Attributes (SOA) and Latent Semantic Analysis (LSA) to highlight discriminative and descriptive properties, respectively. To evaluate our approach, we compare it against standard Bag-of-Words (BOW), SOA, and LSA representations on the PAN 2015 corpus for AP. Experimental results show that the combination of SOA and LSA outperforms BOW and each individual representation, which gives evidence of its usefulness for predicting gender, age, and personality profiles. More importantly, according to the PAN 2015 evaluation, the proposed approach ranks in the top 3 positions on every dataset.
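As a rough illustration of the second-order idea mentioned in the abstract, the sketch below maps each term to a vector of affinities over the target profiles and then represents a document as the weighted average of its term vectors. This is a simplified reading, not the authors' implementation: the log-damped, length-normalised weighting, the function names `term_profile_vectors` and `document_vector`, and the toy data are all assumptions made for the example. The LSA half of the combination is not shown.

```python
import math
from collections import Counter, defaultdict


def term_profile_vectors(docs, labels):
    """Map each term to a vector of affinities, one entry per profile."""
    profiles = sorted(set(labels))
    index = {p: i for i, p in enumerate(profiles)}
    vectors = defaultdict(lambda: [0.0] * len(profiles))
    for tokens, label in zip(docs, labels):
        counts = Counter(tokens)
        for term, tf in counts.items():
            # Assumed weighting: log-damped, length-normalised frequency.
            vectors[term][index[label]] += math.log2(1 + tf / len(tokens))
    # Normalise each term vector so that its entries sum to 1.
    normalised = {}
    for term, vec in vectors.items():
        total = sum(vec)
        normalised[term] = [v / total for v in vec]
    return profiles, normalised


def document_vector(tokens, term_vectors, n_profiles):
    """Represent a document as the frequency-weighted mean of its term vectors."""
    doc = [0.0] * n_profiles
    counts = Counter(t for t in tokens if t in term_vectors)
    known = sum(counts.values())
    if known == 0:
        return doc  # no known terms: all-zero vector
    for term, tf in counts.items():
        for i, v in enumerate(term_vectors[term]):
            doc[i] += (tf / known) * v
    return doc


# Hypothetical toy corpus: tokenised documents with a gender label each.
docs = [
    ["love", "shopping", "love"],
    ["football", "beer"],
    ["shopping", "dress"],
    ["beer", "football", "match"],
]
labels = ["female", "male", "female", "male"]

profiles, term_vectors = term_profile_vectors(docs, labels)
vec = document_vector(["love", "dress", "beer"], term_vectors, len(profiles))
print(dict(zip(profiles, vec)))
```

The resulting document vector has one dimension per profile, so it is dense and low-dimensional compared with a Bag-of-Words vector. A combined representation in the spirit of the abstract would concatenate such a vector with an LSA projection of the same document before training a classifier.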

Similar resources

INAOE's Participation at PAN'13: Author Profiling Task Notebook for PAN at CLEF 2013

This paper describes the participation of the Laboratory of Language Technologies of INAOE at PAN 2013 evaluation lab. We adopted second order representations for facing the problem of Author Profiling (AP). This representation tackles two shortcomings of the typical Bag-of-Terms: i) the sparsity and high dimensionality of document representations, and ii) the assumption of total independence b...

Style-based Distance Features for Author Verification Notebook for PAN at CLEF 2013

In this paper we present the approach we took in our participation in the PAN 2013 Author Profiling task. It is an adaptation of our system submitted for author identification, assuming that a profile category (authors belonging to the same gender and age group categories) can be analyzed in the same way as an author’s style.

INSA LYON and UNI PASSAU's Participation at PAN@CLEF'17: Author Profiling task

This paper describes the participation of INSA Lyon and UNI Passau at the PAN 2017 Author Profiling task. Given the language and tweets from an author, the goal is to predict his/her gender and language variety. We consider two strategies: a "loose" classification that learns one predictive model for the gender and another one for the variety, and a "successive" classification that first predi...

Style-based Distance Features for Author Profiling Notebook for PAN at CLEF 2013

In this paper we present the approach we took in our participation in the PAN 2013 Author Identification task. It relies on a complex process to select the features which represent the author’s writing, using potentially multiple statistics and distance measures computed from the training set.

Using Intra-Profile Information for Author Profiling

In this paper we describe the participation of the Laboratory of Language Technologies of INAOE at PAN 2014. We address the Author Profiling (AP) task by finding and exploiting relationships among terms, documents, profiles and subprofiles. Our approach uses the idea of second order attributes (a low-dimensional and dense document representation) [4], but goes beyond incorporating information among...


Publication date: 2015